We are running a separate Prometheus stack on each of several EKS clusters, so each cluster has its own Alertmanager. Each Alertmanager sends its alerts to a service in the PagerDuty service directory using the Prometheus integration (incident_key) in the Alertmanager config.
The receiver looks like this:
- name: pagerduty-demo
  pagerduty_configs:
  - service_key: 'xxx40bdb9xxxx09c07792xxxx'
We have tried some workarounds with the group_by parameter, but they did not solve the issue. The issue is that alerts from multiple clusters are getting grouped into a single incident. A single incident ends up containing the xyz alert for several clusters, which makes it hard to keep track of critical alerts. Can someone help me with how this can be resolved?
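
For reference, here is a minimal sketch of the kind of group_by setup in question. It assumes each cluster's Prometheus adds an external label named `cluster` (a hypothetical label name, set under `global.external_labels` in that cluster's Prometheus config), so that the label is present on every alert reaching Alertmanager. This is an untested sketch under those assumptions, not a confirmed fix.

```yaml
# Alertmanager config sketch. Assumes every alert carries a `cluster` label
# (hypothetical name), e.g. from Prometheus `global.external_labels`.
route:
  receiver: pagerduty-demo
  # Grouping on `cluster` as well as `alertname` is intended to give each
  # cluster its own alert group instead of an identical group per cluster.
  group_by: ['cluster', 'alertname']
receivers:
- name: pagerduty-demo
  pagerduty_configs:
  - service_key: 'xxx40bdb9xxxx09c07792xxxx'
```

As far as I understand, Alertmanager derives the PagerDuty dedup key from the alert group, so separate Alertmanagers that produce identical groups could land on the same incident. Including a per-cluster label in group_by is meant to make those groups differ.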